Jayanthi, S. K.
- Comprehensive Evaluation of Machine Learning Techniques and Novel Features for Web Link Spamdexing Detection
Authors
Affiliations
1 Department of Computer Science, Vellalar College for Women, Erode, IN
2 Department of Computer Science, KSR College of Arts and Science, Tiruchengode, IN
3 Kumaraguru College of Technology, Coimbatore, IN
Source
ScieXplore: International Journal of Research in Science, Vol 1, No 2 (2014), Pagination: 98-109
Abstract
The World Wide Web (WWW) is a huge, dynamic, self-organized, and strongly interlinked source of information. Search engines have become a vital Information Retrieval (IR) system for finding the required information. Results appearing in the first few pages attract more attention and carry more importance, since users believe their top positions make them more relevant. Spamdexing plays a key role in gaining high rank and top visibility for an undeserving page. This paper focuses on two aspects: new features and new classifiers. First, 27 new features that are used commercially to boost ranking and reputation are considered for classification. Along with them, 17 new features were proposed and computed. In total, 44 features were combined with the existing WEBSPAM-UK 2007 dataset, which is the baseline. With all these features, a feature inclusion study is carried out to improve performance. The second aspect considered in this paper is exploring a new suite of five different machine learners for the web spam classification problem. Results are discussed. Including the new features improves the classification accuracy of the publicly available WEBSPAM-UK 2007 features by 22%. SVM outperforms the other methods in terms of accuracy.
Keywords
Decision Table, HMM, Search Engine, SVM, Web Spam.
- Edge Detection Using Multispectral Thresholding
Authors
Affiliations
1 Department of Computer Science, J.K.K. Nataraja College of Arts & Science, IN
2 Department of Computer Science, Vellalar College for Women, IN
Source
ICTACT Journal on Image and Video Processing, Vol 6, No 4 (2016), Pagination: 1267-1272
Abstract
Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and extraction. Among the various edge detection methods, the Otsu method is one of the best thresholding methods for general real-world images with regard to uniformity and shape measures. In this paper, a multispectral thresholding algorithm using the Otsu method is proposed to detect the edges in multispectral images. Natural, art, and simulated images are considered for testing. Since the edges in the simulated images are known in advance, those images are used for performance evaluation. The results of the proposed method, Edge Detection using MultiSpectral Thresholding (EDMST), are compared against those of the Canny Otsu, Improved Otsu, Median based Otsu, and Improved Gray Image Otsu edge detection algorithms based on the human visual system, the number of edges, and the number of pixels. The experimental results show that the proposed method achieves better performance, and it is therefore applied to satellite images.
Keywords
Edge Detection, Multispectral Thresholding, Otsu Method, Satellite Images, EDMST.
- NLSDF for Boosting the Recital of Web Spamdexing Classification
Authors
Affiliations
1 Department of Computer Science, Vellalar College for Women, IN
2 Department of Computer Science, Hindusthan College of Arts and Science, IN
Source
ICTACT Journal on Soft Computing, Vol 7, No 1 (2016), Pagination: 1324-1331
Abstract
Spamdexing is the art of black hat SEO. Features that strongly influence rank and visibility are manipulated for the SEO task. The motivation behind this work is to use state-of-the-art website optimization features to enhance the performance of spamdexing detection. Features that play a central role in current SEO strategies show a significant deviation between spam and non-spam samples. This paper proposes 44 features named NLSDF (New Link Spamdexing Detection Features). Social media has an impact on search engine result ranking, so features pertaining to social media were incorporated with the NLSDF features to boost the performance of spamdexing classification. The 44 NLSDF attributes, along with 5 social media features, boost the classification performance on the WEBSPAM-UK 2007 dataset. A one-tailed paired t-test at 95% confidence, performed on the AUC values of the learning models, shows the significance of the NLSDF.
Keywords
Web Spam, Search Engine, SVM, Decision Table, HMM.
- Web Link Spam Identification Inspired by Artificial Immune System and the Impact of TPP-FCA Feature Selection on Spam Classification
Authors
Affiliations
1 Department of Computer Science, Vellalar College for Women, IN
2 Department of Computer Science, K.S.R College of Arts and Science, IN
Source
ICTACT Journal on Soft Computing, Vol 4, No 1 (2013), Pagination: 633-644
Abstract
Search engines are the gateway to retrieving required information from the web. Web spam is an illegitimate method of improving the ranking and visibility of web pages in search engine results. This paper addresses the problem of link spam classification through the features of web sites. Link-related features retrieved from a website are used to discriminate between spam and non-spam sites. AIS-inspired algorithms are applied to the dataset and the results are evaluated. Artificial immune systems (AIS) are machine learning systems inspired by the principles of natural immunology. They comprise supervised learning schemes that can be adapted to a wide range of classification problems. The UK-WEBSPAM-2007 dataset [8] is used for the experiments. WEKA [9] is used to simulate the classifiers. The Artificial Immune Recognition algorithm appears to perform better than the others. The best classification accuracy attained is 98.89, by the AIRS1 algorithm. This compares well with the classifier accuracies reported in the existing literature.
Keywords
Web Spam, Search Engine, TPP, FCA, AIRS.
- Measuring the Performance of Similarity Propagation in an Semantic Search Engine
Authors
S. K. Jayanthi 1, S. Prema 2
Affiliations
1 Department of Computer Science, Vellalar College for Women, IN
2 Department of Computer Science, K.S.R College of Arts and Science, IN
Source
ICTACT Journal on Soft Computing, Vol 4, No 1 (2013), Pagination: 667-672
Abstract
In the current scenario, the personalization of web page results plays a vital role. Nearly 80% of users expect the best results on the first page itself and lack the patience to browse longer in URL mode. This research work focuses on two main themes: online semantic web search and offline domain-based search. The first part is to find an effective method that groups similar results together using the BookShelf Data Structure and organizes the resulting clusters. The second focuses on offline search within the academic domain. This paper concentrates on finding similar documents and on how the vector space model can be used to do so, so more weight is given to the principles and working methodology of similarity propagation. The cosine similarity measure is used to determine the relevance among documents.
Keywords
Semantic Web, BookShelf Data Structure, Similarity Propagation, Cosine Similarity measure, Vector Space Model.
- Reptree Classifier for Identifying Link Spam in Web Search Engines
Authors
Affiliations
1 Department of Computer Science, Vellalar College for Women, IN
2 Department of Computer Science, KSR College of Arts and Science, IN
Source
ICTACT Journal on Soft Computing, Vol 3, No 2 (2013), Pagination: 498-505
Abstract
Search engines are used for retrieving information from the web. Most of the time, importance is placed on the top 10 results, and sometimes this shrinks to the top 5, because of time constraints and reliance on the search engines. Users believe that the top 10 or 5 results are the most relevant. Here the problem of spamdexing arises: it is a method that undermines search result quality. Falsified metrics, such as inserting an enormous number of keywords or links into a website, may take that website to the top 10 or 5 positions. This paper proposes a classifier based on the Reptree (Regression tree representative). As an initial step, link-based features such as neighbors, pagerank, truncated pagerank, trustrank, and assortativity-related attributes are inferred. Based on these features, a tree is constructed. The tree uses the feature inference to differentiate spam sites from legitimate sites. The WEBSPAM-UK-2007 dataset is taken as the base. It is preprocessed and converted into five datasets: FEATA, FEATB, FEATC, FEATD, and FEATE. Only link-based features are used in the experiments, as this paper focuses on link spam alone. Finally, a representative tree is created that classifies the web spam entries more precisely. Results are given. Regression tree classification appears to perform well, as shown through the experiments.
Keywords
Web Link Spam, Classification, Reptree, Decision Tree, Search Engine.
- Improving Personalized Web Search Using Bookshelf Data Structure
Authors
S. K. Jayanthi 1, S. Prema 2
Affiliations
1 Department of Computer Science, Vellalar College for Women, IN
2 Department of Computer Science, K.S.R College of Arts and Science, IN
Source
ICTACT Journal on Soft Computing, Vol 3, No 1 (2012), Pagination: 434-439
Abstract
Search engines play a vital role in retrieving relevant information for the web user. In this research work, a user-profile-based web search is proposed, so that web users from different domains may receive different sets of results. The main challenge is to provide relevant results at the right level of reading difficulty. Estimating user expertise and re-ranking the results are the main aspects of this paper. The retrieved results are arranged in a Bookshelf Data Structure for easy access. Better presentation of search results thus significantly increases the usability of web search engines in visual mode.
Keywords
Web Search Personalization, Bookshelf Data Structure, Agglomerative Hierarchical Clustering, Similarity Measure, Visualization.
- X GraphticsCLUS: Web Mining Hyperlinks and Content of Terrorism Websites for Homeland Security
Authors
Affiliations
1 Computer Science Department, Vellalar College for Women, Erode, IN
2 Computer Science Department, KSR College of Arts and Science, Tiruchengode, IN